{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Imports"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Read a DataFrame From an Excel File"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Pandas has a read_excel function which can be used to create a DataFrame from an Excel file. "
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Stock | \n",
" Price | \n",
" Date | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" MSFT | \n",
" 100 | \n",
" 2019-01-01 | \n",
"
\n",
" \n",
" 1 | \n",
" MSFT | \n",
" 110 | \n",
" 2019-02-01 | \n",
"
\n",
" \n",
" 2 | \n",
" MSFT | \n",
" 105 | \n",
" 2019-03-01 | \n",
"
\n",
" \n",
" 3 | \n",
" MSFT | \n",
" 112 | \n",
" 2019-04-01 | \n",
"
\n",
" \n",
" 4 | \n",
" AAPL | \n",
" 500 | \n",
" 2019-01-01 | \n",
"
\n",
" \n",
" 5 | \n",
" AAPL | \n",
" 512 | \n",
" 2019-02-01 | \n",
"
\n",
" \n",
" 6 | \n",
" AAPL | \n",
" 482 | \n",
" 2019-03-01 | \n",
"
\n",
" \n",
" 7 | \n",
" AAPL | \n",
" 525 | \n",
" 2019-04-01 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Stock Price Date\n",
"0 MSFT 100 2019-01-01\n",
"1 MSFT 110 2019-02-01\n",
"2 MSFT 105 2019-03-01\n",
"3 MSFT 112 2019-04-01\n",
"4 AAPL 500 2019-01-01\n",
"5 AAPL 512 2019-02-01\n",
"6 AAPL 482 2019-03-01\n",
"7 AAPL 525 2019-04-01"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.read_excel('Stock Data.xlsx')\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"By default, it will load the first sheet. However, we can pass a name of a sheet to get that sheet instead."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Stock | \n",
" Price | \n",
" Date | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" AMZN | \n",
" 30 | \n",
" 2019-01-01 | \n",
"
\n",
" \n",
" 1 | \n",
" AMZN | \n",
" 35 | \n",
" 2019-02-01 | \n",
"
\n",
" \n",
" 2 | \n",
" AMZN | \n",
" 28 | \n",
" 2019-03-01 | \n",
"
\n",
" \n",
" 3 | \n",
" AMZN | \n",
" 32 | \n",
" 2019-04-01 | \n",
"
\n",
" \n",
" 4 | \n",
" FB | \n",
" 900 | \n",
" 2019-01-01 | \n",
"
\n",
" \n",
" 5 | \n",
" FB | \n",
" 910 | \n",
" 2019-02-01 | \n",
"
\n",
" \n",
" 6 | \n",
" FB | \n",
" 920 | \n",
" 2019-03-01 | \n",
"
\n",
" \n",
" 7 | \n",
" FB | \n",
" 930 | \n",
" 2019-04-01 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Stock Price Date\n",
"0 AMZN 30 2019-01-01\n",
"1 AMZN 35 2019-02-01\n",
"2 AMZN 28 2019-03-01\n",
"3 AMZN 32 2019-04-01\n",
"4 FB 900 2019-01-01\n",
"5 FB 910 2019-02-01\n",
"6 FB 920 2019-03-01\n",
"7 FB 930 2019-04-01"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.read_excel('Stock Data.xlsx', sheet_name='The Second Sheet')\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Write a DataFrame to an Excel File"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## WARNING\n",
"\n",
"Do not output to an existing workbook, as the original workbook will be overwritten. There is no undo!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Pandas provides `DataFrame.to_excel` to create Excel workbooks from DataFrames. But first, let's modify the data to ensure we're outputting the current `DataFrame`. We'll go ahead and add a column."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Stock | \n",
" Price | \n",
" Date | \n",
" Custom Column | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" AMZN | \n",
" 30 | \n",
" 2019-01-01 | \n",
" stuff | \n",
"
\n",
" \n",
" 1 | \n",
" AMZN | \n",
" 35 | \n",
" 2019-02-01 | \n",
" stuff | \n",
"
\n",
" \n",
" 2 | \n",
" AMZN | \n",
" 28 | \n",
" 2019-03-01 | \n",
" stuff | \n",
"
\n",
" \n",
" 3 | \n",
" AMZN | \n",
" 32 | \n",
" 2019-04-01 | \n",
" stuff | \n",
"
\n",
" \n",
" 4 | \n",
" FB | \n",
" 900 | \n",
" 2019-01-01 | \n",
" stuff | \n",
"
\n",
" \n",
" 5 | \n",
" FB | \n",
" 910 | \n",
" 2019-02-01 | \n",
" stuff | \n",
"
\n",
" \n",
" 6 | \n",
" FB | \n",
" 920 | \n",
" 2019-03-01 | \n",
" stuff | \n",
"
\n",
" \n",
" 7 | \n",
" FB | \n",
" 930 | \n",
" 2019-04-01 | \n",
" stuff | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Stock Price Date Custom Column\n",
"0 AMZN 30 2019-01-01 stuff\n",
"1 AMZN 35 2019-02-01 stuff\n",
"2 AMZN 28 2019-03-01 stuff\n",
"3 AMZN 32 2019-04-01 stuff\n",
"4 FB 900 2019-01-01 stuff\n",
"5 FB 910 2019-02-01 stuff\n",
"6 FB 920 2019-03-01 stuff\n",
"7 FB 930 2019-04-01 stuff"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['Custom Column'] = 'stuff'\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's write this to a workbook. You just need to specify what to call the new workbook. You must end it with `.xlsx` for it to be a valid workbook."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"df.to_excel('New Book.xlsx')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"After running the contents to here, you should now see `New Book.xlsx` in the same folder as this notebook. Open it and verify that the contents match the `DataFrame`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You'll notice that the created workbook has the 0, 1, 2... index alongside it, which we don't want. Just pass `index=False` to get rid of it. Make sure you close the workbook before trying to write to it again, or you'll get an error."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"df.to_excel('New Book.xlsx', index=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now the data looks like we want. The only thing is it's currently outputting to a sheet named \"Sheet1\", but we may want a better name. You can provide it here."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"df.to_excel('New Book.xlsx', index=False, sheet_name='Stock Data')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now it has a proper sheet name and the data displays properly."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Advanced"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following will not be covered in class, but may be useful for you."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Writing to Multiple Sheets"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"While in general, writing to a workbook replaces it, there is a way to write multiple sheets to a workbook, through the `ExcelWriter`. But first let's create a second `DataFrame` to write."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Stock | \n",
" Price | \n",
" Date | \n",
" Custom Column | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" AMZN | \n",
" 30 | \n",
" 2019-01-01 | \n",
" other stuff | \n",
"
\n",
" \n",
" 1 | \n",
" AMZN | \n",
" 35 | \n",
" 2019-02-01 | \n",
" other stuff | \n",
"
\n",
" \n",
" 2 | \n",
" AMZN | \n",
" 28 | \n",
" 2019-03-01 | \n",
" other stuff | \n",
"
\n",
" \n",
" 3 | \n",
" AMZN | \n",
" 32 | \n",
" 2019-04-01 | \n",
" other stuff | \n",
"
\n",
" \n",
" 4 | \n",
" FB | \n",
" 900 | \n",
" 2019-01-01 | \n",
" other stuff | \n",
"
\n",
" \n",
" 5 | \n",
" FB | \n",
" 910 | \n",
" 2019-02-01 | \n",
" other stuff | \n",
"
\n",
" \n",
" 6 | \n",
" FB | \n",
" 920 | \n",
" 2019-03-01 | \n",
" other stuff | \n",
"
\n",
" \n",
" 7 | \n",
" FB | \n",
" 930 | \n",
" 2019-04-01 | \n",
" other stuff | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Stock Price Date Custom Column\n",
"0 AMZN 30 2019-01-01 other stuff\n",
"1 AMZN 35 2019-02-01 other stuff\n",
"2 AMZN 28 2019-03-01 other stuff\n",
"3 AMZN 32 2019-04-01 other stuff\n",
"4 FB 900 2019-01-01 other stuff\n",
"5 FB 910 2019-02-01 other stuff\n",
"6 FB 920 2019-03-01 other stuff\n",
"7 FB 930 2019-04-01 other stuff"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df2 = df.copy()\n",
"df2['Custom Column'] = 'other stuff'\n",
"df2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we have two `DataFrame`s, let's write them to the same workbook. Here we must use the `with` syntax and instead of passing the file path to each `to_excel` command, we will pass the writer."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"with pd.ExcelWriter('New Book.xlsx') as writer:\n",
" df.to_excel(writer, sheet_name='First Df', index=False)\n",
" df2.to_excel(writer, sheet_name='Second Df', index=False)\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now you should see both `DataFrame`s as sheets in the workbook."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 4
}